Room 306, Level 3, Biomedical Building, Australian Technology Park, Eveleigh
Ph: 02 8627 1024
Email: floris.vanogtrop@sydney.edu.au
Teaching schedule
Januar Harianto
Weeks 1 – 4, Lecturer
Floris van Ogtrop
Weeks 5 – 8, Unit Coordinator
Si Yang Han
Weeks 9 – 12, Lecturer
About ENVX1002
Learning outcomes
LO1. Implement basic reproducible research practices – including consistent data organisation, documented code, and version-controlled workflows so that statistical analyses and results can be readily replicated and validated by others.
LO2. Demonstrate proficiency in utilising R and Excel to effectively explore and describe life science datasets.
LO3. Apply parametric and non-parametric statistical inference methods to experimental and observational data using RStudio and effectively interpret and communicate the results in the context of the data.
LO4. Be able to put into practice both linear and non-linear models to describe relationships between variables using RStudio and Excel, demonstrating creativity in developing models that effectively represent complex data patterns.
LO5. Be able to articulate statistical and modelling results clearly and convincingly in both written reports and oral presentations, working effectively as an individual and collaboratively in a team, showcasing the ability to convey complex information to varied audiences.
Delivery format
All lectures and tutorials are held in ABS Lecture Theatre 1130. Lab sessions are held in the Biomedical Building, Australian Technology Park, Eveleigh.
Lectures (recorded): deliver content, provide context, and introduce new concepts
Tutorials (recorded): practice and apply concepts from lectures, prep for labs
Labs: hands-on practice with R and data analysis, with demonstrators to help you
The following are optional (but highly recommended):
Drop-in sessions: additional help and support, mostly on Zoom
Ed discussion: online forum for questions and discussions
Timetable
Lectures (recorded)
Monday 12pm–1pm, ABS Lecture Theatre 1130
Tuesday 9am–10am, ABS Lecture Theatre 1130
Tutorials (recorded)
Tuesday 10am–11am, ABS Lecture Theatre 1130
1-hour tutorial directly following your lecture
Computer Labs
2-hour in-person lab session with demonstrators
Biomedical Building, Australian Technology Park, Eveleigh
See timetable for your allocated time
Schedule at a glance…
sequenceDiagram
participant M as Mon
participant T as Tue
participant W as Wed
participant Th as Thu
participant F as Fri
participant S as Sat
participant Su as Sun
Note over M,T: Lectures (recorded) - ABS LT 1130
Note over T: Tutorial (recorded) - ABS LT 1130
Note over T,Th: Lab Sessions - Biomedical Building
Th->>+Su: Self-revision, pick ONE day (encouraged)
ENVX-resources – GitHub repository for our open-source materials
Ed Discussion – main platform for ANNOUNCEMENTS and Q&A
Where are the Labs?
Lab sessions include extra time (30 minutes) for travel – already programmed in the timetable (so clashes are avoided)
A free shuttle service is available between campus and the labs, but the schedule is very limited
Take advantage of the new community access gates at Redfern Station: they save 5 minutes
Content & assessments
Topic outline
Week 01 - Data: Reproducible science
Week 02 - Data: Introduction to statistical programming
Week 03 - Data: Exploring and visualising data
Week 04 - Data: The Central Limit Theorem
Week 05 - Inference: 1-sample tests
Week 06 - Inference: 2-sample tests
Week 07 - Inference: Non-parametric tests 1
Week 08 - Inference: Non-parametric tests 2
Week 09 - Modelling: Describing relationships
Week 10 - Modelling: Linear functions
Week 11 - Modelling: Linear functions – multiple predictors
Week 12 - Modelling: Non-linear functions
Week 13 - Revision: Past exam questions and review
Assessments
# calculate this year's year number
library(lubridate)
year <- year(Sys.Date())
address <- paste0("https://www.sydney.edu.au/units/ENVX1002/", year, "-S1C-ND-CC")
The most up-to-date (and slightly more comprehensive) information for 2025 is on the unit outline page at the address built above. In a nutshell:
| Week | Assessment | Description |
| --- | --- | --- |
| 3 | Early Feedback Quiz (individual 5%) | In-person - 15 minutes |
| 5 | Project 1: Exploring data (individual 10%) | Written report, 500 words |
| 8 | Coding and data skills evaluation (individual 15%) | In-person - 50 minutes |
| 13 | Project 2: Modelling (10% + Peer assessment 5%) | Group presentation - 5 minutes |
| Exam | Final exam (individual 45%) | MCQ + SAQ Questions - 2 hours |
Week 3: The early feedback quiz is a chance for us to gauge your understanding and provide feedback
Week 8: Coding and data skills evaluation covers R data manipulation and analysis
Final exam will NOT require you to write or interpret code – focus on understanding concepts and interpreting results
Software and tools
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
– John Tukey (1915 – 2000)
Baby steps…
This unit is designed for beginners - no prior statistics or programming required
We start with basics – pace increases after week 4
Focus on understanding concepts first, then tools
We provide plenty of support – more on this later
Our tech stack
MS Excel – for data entry and basic analysis
R – a programming language for data analysis
RStudio – an integrated development environment (IDE) for R
Quarto (Markdown) – a key platform for reproducible reports and documents
MS Excel
A standard tool in many industries, including science, often to store data
Can be a useful complement to R for data cleaning and simple calculations
A stepping stone to more advanced tools?
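As one possible illustration of Excel complementing R, data entered in Excel can be saved as a CSV file and then read into R. The sketch below is only an assumption about how that might look: the dataset is made up, the column names `plot`, `treatment` and `height_cm` are hypothetical, and the data is supplied inline so the example runs without an external file.

```r
# Hypothetical data, as it might look after saving an Excel sheet as CSV
csv_text <- "plot,treatment,height_cm
1,control,12.5
2,control,11.8
3,fertiliser,14.1
4,fertiliser,15.3"

# read.csv() usually takes a file path; text = reads from a string instead
plants <- read.csv(text = csv_text)

# Quick checks on what was imported
str(plants)
mean(plants$height_cm)
```

With a real file, the `text = csv_text` argument would be swapped for the path to the exported CSV.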
R
A free, open-source programming language
Widely used for data analysis and statistics
Standard tool in scientific research
Extensive collection of packages for data science
Strong support for creating publication-quality graphics
Large, active community for help and resources
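To give a rough feel for the points above, here is a minimal sketch of a few lines of R describing and plotting a small life science dataset. It uses `PlantGrowth`, a dataset that ships with R; the choice of dataset and the object name `group_means` are ours for illustration and are not taken from the unit materials.

```r
# PlantGrowth ships with R: dried plant weights under three conditions
data(PlantGrowth)

# Summary statistics for every column
summary(PlantGrowth)

# Mean weight per treatment group
group_means <- aggregate(weight ~ group, data = PlantGrowth, FUN = mean)
print(group_means)

# Boxplot of weight by group, using base R graphics
boxplot(weight ~ group,
  data = PlantGrowth,
  xlab = "Treatment group", ylab = "Dried weight"
)
```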
Why R?
Built for beginners
Makes your work reproducible
Powerful yet accessible
Importantly – the skills you learn are highly transferable to other tools and languages.
Most easily integrated with generative AI tools – more on this soon
Well-documented and discussed online (so you can find help easily)
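One small way to see what "reproducible" can mean in practice: fixing the random seed makes a script produce the same "random" numbers on every run. This is a sketch only; the seed value `1002` and the variable names are arbitrary choices for illustration.

```r
# Fix the seed so the random draws can be repeated exactly
set.seed(1002)
first_run <- rnorm(5)

# Resetting the same seed reproduces the same draws
set.seed(1002)
second_run <- rnorm(5)

# TRUE when both runs produced identical numbers
identical(first_run, second_run)
```

Anyone re-running the script with the same seed should obtain the same values, which makes simulated results easier to check.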
RStudio
NOT the same as R – it’s an integrated development environment (IDE)
Runs R (…and Python, and SQL, and more)
Makes it easier to write and run R code by providing a significantly more user-friendly interface
Starting with R
It’s normal to feel overwhelmed at first
We’ll learn step by step
Practice is key - a little bit each day helps
Don’t hesitate to ask questions!
Satisfying when it works
Code for this animation:
# Load required packages
library(gapminder) # Dataset of country statistics over time
library(gganimate) # For creating animations in ggplot
library(tidyverse) # Collection of data science packages

# Create an animated plot showing how life expectancy relates to GDP
# across different continents over time
ggplot(
  gapminder,
  aes(gdpPercap, lifeExp, # GDP per capita vs life expectancy
    size = pop, # Point size represents population
    colour = country
  )
) + # Each country gets its own color
  geom_point(
    alpha = 0.7, # Semi-transparent points
    show.legend = FALSE
  ) + # Hide legend for cleaner look
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) + # Set min/max point sizes
  scale_x_log10() + # Log scale for GDP (wide range)
  facet_wrap(~continent) + # Separate plot for each continent
  labs(
    title = "Year: {frame_time}",
    x = "GDP per capita",
    y = "Life expectancy"
  ) +
  transition_time(year) + # Animate through years
  ease_aes("linear") # Smooth transitions